Method of reproducing a stereophonic sound, stereophonic sound reproduction equipment, and non-transitory computer-readable recording medium
Patent abstract:
METHOD OF REPRODUCING A STEREOPHONIC SOUND, STEREOPHONIC SOUND REPRODUCTION EQUIPMENT, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM. A method and equipment reproduce a stereophonic sound. The method includes obtaining sound depth information that denotes a distance between at least one sound object within a sound signal and a reference position, and providing sound perspective to the sound object emitted from a speaker, based on the sound depth information. Publication number: BR112012028272B1 Application number: R112012028272-7 Filing date: 2011-05-04 Publication date: 2021-07-06 Inventor: Sun-min Kim Applicant: Electronics Co., Ltd. Primary IPC class:
Patent description:
TECHNICAL FIELD [0001] Equipment and methods consistent with exemplary embodiments relate to reproducing a stereophonic sound and, more specifically, to reproducing a stereophonic sound in which perspective is given to a sound object. BACKGROUND ART [0002] With the development of video technology, users can now view stereoscopic three-dimensional (3D) images. By using various methods such as, for example, a binocular parallax method, a stereoscopic 3D image presents left-viewpoint image data to a left eye and right-viewpoint image data to a right eye. The user can thus realistically perceive an object advancing off a screen or returning into the screen using 3D video technology. [0003] Meanwhile, stereophonic sound technology can enable the user to detect the location and presence of sounds by placing several speakers around the user. However, with related-art stereophonic sound technology, a sound associated with an image object approaching the user or moving away from the user cannot be effectively expressed, and thus sound effects that correspond to a stereoscopic image cannot be provided. DISCLOSURE OF THE INVENTION SOLUTION TO THE PROBLEM [0004] Exemplary embodiments may address at least the above-mentioned problems and/or disadvantages and other disadvantages not described above. Furthermore, exemplary embodiments are not required to overcome the disadvantages described above, and an exemplary embodiment may not overcome any of the problems described above. [0005] One or more exemplary embodiments provide methods and equipment to effectively reproduce stereophonic sound and, more particularly, methods and equipment to effectively express sounds that approach the user or move away from the user by providing perspective to a sound object.
ADVANTAGEOUS EFFECTS OF THE INVENTION [0006] According to the related art, it is difficult to obtain depth information, because depth information of an image object must be provided as additional information or must be obtained by analyzing the image data. However, according to an exemplary embodiment, based on the fact that information about a position of an image object can be included in a sound signal, depth information is generated by analyzing the sound signal. Thus, the depth information of an image object can be easily obtained. [0007] Furthermore, according to the related art, phenomena such as an image object advancing from a screen or returning into the screen are not properly expressed by a sound signal. However, according to an exemplary embodiment, by expressing sound objects so that they match an image object advancing from or returning into a screen, the user can detect a more realistic stereo effect. [0008] Furthermore, according to an exemplary embodiment, a distance between the position where a sound object is generated and a reference position can be effectively expressed. In particular, as perspective is given to each sound object, the user can effectively detect a stereo sound effect. [0009] Exemplary embodiments can be embodied as computer programs and can be implemented in general-purpose digital computers that run the programs using a computer-readable recording medium. [00010] Examples of computer-readable recording media include storage media such as magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs or DVDs). [00011] The foregoing exemplary embodiments and advantages are exemplary only and should not be considered limiting. The present teaching can be readily applied to other types of equipment.
Furthermore, the description of the exemplary embodiments is intended to be illustrative, and not to limit the scope of the claims, and many alternatives, modifications, and variations will be apparent to those skilled in the art. BRIEF DESCRIPTION OF THE DRAWINGS [00012] The above and/or other aspects will become more evident from the description of certain exemplary embodiments, with reference to the accompanying drawings, in which: Figure 1 is a block diagram illustrating stereophonic sound reproduction equipment according to an exemplary embodiment; Figure 2 is a block diagram illustrating a sound depth information obtaining unit according to an exemplary embodiment; Figure 3 is a block diagram illustrating stereophonic sound reproduction equipment that provides stereophonic sound by using a two-channel sound signal, according to an exemplary embodiment; Figures 4A, 4B, 4C and 4D illustrate examples of providing a stereophonic sound according to an exemplary embodiment; Figure 5 is a flowchart illustrating a method of generating sound depth information based on a sound signal, according to an exemplary embodiment; Figures 6A, 6B, 6C and 6D illustrate an example of generating sound depth information from a sound signal, according to an exemplary embodiment; and Figure 7 is a flowchart illustrating a method of reproducing a stereophonic sound according to an exemplary embodiment. BEST MODE FOR CARRYING OUT THE INVENTION [00013] According to an aspect of an exemplary embodiment, a method of reproducing a stereophonic sound is provided, the method including: obtaining sound depth information denoting a distance between at least one sound object within a sound signal and a reference position; and providing sound perspective to the sound object based on the sound depth information.
[00014] The sound signal can be divided into a plurality of sections, and obtaining the sound depth information may include obtaining the sound depth information by comparing the sound signal in a previous section with the sound signal in a current section. [00015] Obtaining the sound depth information may include: calculating a power of each frequency band of each of the previous and current sections; determining, based on the power of each frequency band, a frequency band that has a power of a predetermined value or greater and is common to adjacent sections, as a common frequency band; and obtaining the sound depth information based on a difference between the power of the common frequency band in the current section and the power of the common frequency band in the previous section. [00016] The method may further include obtaining, from the sound signal, a center channel signal that is output to a center speaker, wherein calculating a power includes calculating the power of each frequency band based on the center channel signal. [00017] Providing the sound perspective may include adjusting the power of the sound object based on the sound depth information. [00018] Providing the sound perspective may include adjusting a gain and a time delay of a reflection signal that is generated when the sound object is reflected, based on the sound depth information. [00019] Providing the sound perspective may include adjusting a size of a low-band component of the sound object based on the sound depth information. [00020] Providing the sound perspective may include adjusting a phase difference between a phase of a sound object that is to be emitted from a first speaker and a phase of a sound object that is to be emitted from a second speaker. [00021] The method may further include outputting the sound object, to which perspective is provided, using a left-side surround speaker and a right-side surround speaker, or using a left-side front speaker and a right-side front speaker.
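To make the power-adjustment step of paragraph [00017] concrete, the following is a minimal sketch. The function names and the linear mapping from depth index to gain are illustrative assumptions, not the patent's formula; the only idea taken from the text is that a sound object with a larger depth index (closer to the reference position) receives a larger power.

```python
# Illustrative sketch of the power adjustment in [00017].
# Assumption: a depth index in [0, 1], where 1 means "closest to the
# user", is mapped linearly to an amplitude gain. The mapping itself
# is hypothetical; the patent only states that power is adjusted
# based on the sound depth information.
def depth_to_gain(depth_index, max_boost=2.0):
    """Closer sound objects (depth_index near 1) get a larger gain."""
    return 1.0 + (max_boost - 1.0) * depth_index

def apply_perspective(samples, depth_index):
    """Scale a sound object's samples by the depth-derived gain."""
    gain = depth_to_gain(depth_index)
    return [s * gain for s in samples]
```

Under this assumed mapping, a sound object with depth index 0 is left unchanged (gain 1.0), while one with depth index 1 is doubled in amplitude.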
[00022] The method may further include locating a sound stage on an outside of a speaker using the sound signal. [00023] According to another aspect of an exemplary embodiment, stereophonic sound reproduction equipment is provided, including: a sound depth information obtaining unit that obtains sound depth information denoting a distance between at least one sound object within a sound signal and a reference position; and a perspective provision unit that provides sound perspective to the sound object based on the sound depth information. MODE FOR THE INVENTION [00024] Certain exemplary embodiments are described in greater detail below with reference to the accompanying drawings. [00025] In the following description, like drawing reference numerals are used for like elements, even in different drawings. Matters defined in the description, such as detailed construction and elements, are provided to aid in a comprehensive understanding of the exemplary embodiments. However, exemplary embodiments can be practiced without those specifically defined matters. [00026] First, the terms used in the exemplary embodiments are described for convenience of description. [00027] A sound object refers to each sound element included in a sound signal. Several sound objects can be included in one sound signal. For example, a sound signal generated by recording the actual scene of an orchestra performance includes various sound objects generated from various musical instruments such as a guitar, a violin, an oboe, etc. [00028] A sound source refers to an object that has generated a sound object, such as a musical instrument or a voice. In an exemplary embodiment, both an object that actually generated a sound object and an object that the user considers to have generated a sound object are referred to as sound sources. For example, if an apple flies from a screen toward the user while the user is watching a movie, a sound generated by the flying apple (a sound object) is included in a sound signal.
The sound object can be a sound obtained by recording the actual sound generated when an apple is thrown, or it can be a previously recorded sound object that is simply reproduced. In either case, the user perceives the apple as having generated the sound object, and thus the apple is also considered a sound source as defined in an exemplary embodiment. [00029] Sound depth information is information denoting a distance between a sound object and a reference position. In detail, sound depth information refers to a distance between the position where a sound object is generated (the position of a sound source) and a reference position. [00030] In the example described above, if an apple flies from the screen toward the user while the user is watching a movie, the distance between the sound source and the user decreases. To effectively express the approaching apple, the position where the sound object corresponding to the image object is generated needs to be expressed as gradually approaching the user, and the information used to express this aspect is the sound depth information. [00031] A reference position can be any of various positions such as, for example, a position of a predetermined sound source, a position of a speaker, a position of the user, etc. [00032] Sound perspective is a type of sensation the user experiences through a sound object. When listening to a sound object, the user perceives the position where the sound object is generated, that is, the position of the sound source that generated the sound object. The perception of the distance between the position where the sound object is generated and the user's position is referred to as sound perspective. [00033] Next, exemplary embodiments are described with reference to the attached drawings. [00034] Figure 1 is a block diagram illustrating stereophonic sound reproduction equipment 100 according to an exemplary embodiment.
[00035] The stereophonic sound reproduction equipment 100 includes a sound depth information obtaining unit 110 and a perspective provision unit 120. [00036] The sound depth information obtaining unit 110 obtains sound depth information with respect to at least one sound object included in a sound signal. A sound generated from at least one sound source is included in the sound signal. The sound depth information is information that represents a distance between a position where a sound is generated, for example a position of a sound source, and a reference position. [00037] Sound depth information can refer to an absolute distance between a sound object and a reference position and/or a relative distance of a sound object with respect to a reference position. According to another exemplary embodiment, sound depth information may refer to a variation in the distance between a sound object and a reference position. [00038] The sound depth information obtaining unit 110 can obtain the sound depth information by analyzing a sound signal, by analyzing three-dimensional image data, or from an image depth map. In this exemplary embodiment, the description is based on an example in which the sound depth information obtaining unit 110 obtains the sound depth information by analyzing a sound signal. [00039] The sound depth information obtaining unit 110 obtains the sound depth information by comparing each of a plurality of sections constituting a sound signal with its adjacent sections. Various methods of dividing a sound signal into sections can be used. For example, a sound signal can be divided into units of a predetermined number of samples. Each divided section can be referred to as a frame or a block. An example of the sound depth information obtaining unit 110 is described in detail below with reference to Figure 2. [00040] The perspective provision unit 120 processes the sound signal based on the sound depth information so that the user can detect sound perspective.
The perspective provision unit 120 performs the operations described below to enable the user to effectively detect sound perspective. However, the operations performed by the perspective provision unit 120 are examples, and exemplary embodiments are not limited to them. [00041] The perspective provision unit 120 adjusts the power of a sound object based on the sound depth information. The closer to the user a sound object is generated, the greater the power of the sound object is made. [00042] The perspective provision unit 120 adjusts a gain and a time delay of a reflection signal based on the sound depth information. The user hears both a direct sound signal, which reaches the user without being reflected by an obstacle, and a reflection sound signal, which is generated by being reflected by an obstacle. The reflection sound signal has a smaller amplitude than the direct sound signal and arrives at the user's position delayed by a predetermined period of time compared to the direct sound signal. In particular, if a sound object is generated close to the user, the reflection sound signal arrives substantially later than the direct sound signal. [00043] The perspective provision unit 120 adjusts a low-band component of a sound object based on the sound depth information. If a sound object is generated close to the user, the user perceives its low-band component as being prominent. [00044] The perspective provision unit 120 adjusts a phase of a sound object based on the sound depth information. The greater the difference between a phase of a sound object to be emitted from a first speaker and a phase of the sound object to be emitted from a second speaker, the closer the user perceives the sound object to be. [00045] A detailed description of the operations of the perspective provision unit 120 is provided below with reference to Figure 3. [00046] Figure 2 is a block diagram illustrating the sound depth information obtaining unit 110 according to an exemplary embodiment.
[00047] The sound depth information obtaining unit 110 includes a power calculation unit 210, a determination unit 220, and a generation unit 230. [00048] The power calculation unit 210 calculates the power of each frequency band for each of the sections constituting a sound signal. [00049] The method of determining the size of a frequency band may vary according to exemplary embodiments. Below, two methods of determining a frequency band size are described, but exemplary embodiments are not limited to them. [00050] The frequency components of a sound signal can be divided into identical frequency bands. The audible frequency range that humans can hear is 20-20000 Hz. If the audible range is divided into ten identical frequency bands, the size of each frequency band is approximately 2000 Hz. The method of dividing the frequency band of a sound signal into identical frequency bands may be referred to as an equivalent rectangular bandwidth division method. [00051] The frequency components of a sound signal can also be divided into frequency bands of different sizes. Human hearing can recognize even a small frequency change when hearing a low-frequency sound, but cannot recognize small frequency changes when hearing a high-frequency sound. Consequently, considering the human sense of hearing, low-frequency bands are divided densely and high-frequency bands are divided broadly. Thus, low-frequency bands have narrow widths, and high-frequency bands have wider widths. [00052] Based on the power of each frequency band, the determination unit 220 determines, as a common frequency band, a frequency band that has a power of a predetermined value or greater and is common to adjacent sections.
For example, the determination unit 220 selects frequency bands having a power of A or greater in the current section and frequency bands having a power of A or greater in at least one previous section (or frequency bands having up to the fifth highest power in the current section and up to the fifth highest power in the previous section), and determines the frequency bands selected in both the previous section and the current section as common frequency bands. The selection is limited to frequency bands of a predetermined power or greater in order to obtain the position of a sound object having a large signal amplitude. Thus, the influence of a sound object having a small signal amplitude can be minimized, and the influence of a main sound object can be maximized. Another reason the determination unit 220 determines the common frequency band is to determine whether a new sound object that did not exist in the previous section is generated in the current section, or whether a characteristic of a previously existing sound object (e.g., a generation position) has changed. [00053] The generation unit 230 generates the sound depth information based on a difference between the power of the common frequency band in the previous section and the power of the common frequency band in the current section. For convenience of description, the common frequency band is assumed to be 3000-4000 Hz. If the power of the 3000-4000 Hz frequency components in the previous section is 3 W, and the power of the 3000-4000 Hz frequency components in the current section is 4.5 W, the power of the common frequency band has increased. This can be taken as an indication that the sound object of the current section is generated at a position closer to the user.
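The common-frequency-band selection described in paragraph [00052] can be sketched as follows. Band powers are represented as plain dictionaries mapping a band label to a power in watts; the dictionary representation, the function name, and the threshold value are assumptions for illustration.

```python
# Sketch of common-band determination: a band is "common" when its
# power meets the threshold A in both the previous and the current
# section, as described in [00052].
def common_frequency_bands(prev_powers, curr_powers, threshold):
    """Return the sorted labels of bands common to both sections."""
    return sorted(
        band
        for band in prev_powers.keys() & curr_powers.keys()
        if prev_powers[band] >= threshold and curr_powers[band] >= threshold
    )

# Echoing the 3000-4000 Hz example in [00053]: 3 W in the previous
# section, 4.5 W in the current one, with a (hypothetical) threshold
# of 1 W. The weak 4000-5000 Hz band stays excluded.
prev = {"3000-4000 Hz": 3.0, "4000-5000 Hz": 0.2}
curr = {"3000-4000 Hz": 4.5, "4000-5000 Hz": 0.3}
print(common_frequency_bands(prev, curr, threshold=1.0))  # prints ['3000-4000 Hz']
```

Excluding low-power bands this way realizes the text's point that weak sound objects (e.g., noise) should not influence the depth estimate.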
That is, if the difference between the common frequency band power values of adjacent sections is greater than a threshold, this may indicate a change in the distance between the sound object and the reference position. [00054] According to exemplary embodiments, when the power of the common frequency band varies between adjacent sections, it is determined whether there is an image object approaching the user, that is, an image object advancing from a screen, based on depth map information regarding a 3D image. If an image object is approaching the user when the power of the common frequency band varies, it can be determined that the position where the sound object is generated is moving according to the motion of the image object. [00055] The generation unit 230 can determine that the greater the power variation of the common frequency band between the previous section and the current section, the closer to the user a sound object corresponding to the common frequency band is generated in the current section compared to the sound object corresponding to the common frequency band in the previous section. [00056] Figure 3 is a block diagram illustrating stereophonic sound reproduction equipment 300 that provides stereophonic sound by using a two-channel sound signal, according to an exemplary embodiment. [00057] If an input signal is a multi-channel sound signal, it is downmixed to a stereo signal, and then a method according to an exemplary embodiment can be applied. [00058] A Fast Fourier Transform (FFT) unit 310 performs an FFT. [00059] An Inverse Fast Fourier Transform (IFFT) unit 320 performs an IFFT on the signal on which the FFT was performed. [00060] A center signal extraction unit 330 extracts a center signal, corresponding to a center channel, from the stereo signal. The center signal extraction unit 330 extracts a signal that has a high correlation from the stereo signal.
In Figure 3, it is assumed that the sound depth information is generated based on a center channel signal. However, this is only an example, and sound depth information can be generated using other channel signals such as, for example, left or right front channel signals or left or right surround channel signals. [00061] A sound stage extension unit 350 extends a sound stage. The sound stage extension unit 350 artificially provides a time difference or a phase difference to a stereo signal so that the sound stage is located on an outer side of a speaker. [00062] A sound depth information obtaining unit 360 obtains sound depth information based on the center signal. [00063] A parameter calculation unit 370 determines the control parameter values that are required to provide sound perspective to a sound object, based on the sound depth information. [00064] A level control unit 371 controls the amplitude of an input signal. [00065] A phase control unit 372 adjusts the phase of an input signal. [00066] A reflection effect provision unit 373 models a reflection signal that is generated when an input signal is reflected, for example, by a wall. [00067] A near distance effect provision unit 374 models a sound signal that is generated at a close distance from the user. [00068] A mixing unit 380 merges at least one signal and outputs it to a speaker. [00069] Next, an operation of the stereophonic sound reproduction equipment 300 is described in temporal order. [00070] First, when a multi-channel sound signal is input, the multi-channel sound signal is converted to a stereo signal using a downmixer (not shown). [00071] The FFT unit 310 performs an FFT on the stereo signal and outputs the transformed signal to the center signal extraction unit 330. [00072] The center signal extraction unit 330 compares the transformed stereo signals and outputs a signal having a high correlation as a center channel signal.
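As a rough illustration of what the center signal extraction unit 330 does, the sketch below takes the mid signal (L+R)/2 and scales it by the inter-channel correlation coefficient, so that only strongly correlated content survives. This particular formula is an assumption for illustration; the patent only states that the unit extracts a highly correlated signal from the stereo pair.

```python
# Hedged sketch of center-channel extraction: keep the part of the
# stereo pair that the two channels share. The correlation-weighted
# mid signal used here is an illustrative choice, not necessarily
# the method of extraction unit 330.
def correlation(left, right):
    """Pearson correlation coefficient of two equal-length channels."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    denom = (var_l * var_r) ** 0.5
    return cov / denom if denom else 0.0

def extract_center(left, right):
    """Correlation-weighted mid signal; anti-correlated content is dropped."""
    weight = max(correlation(left, right), 0.0)
    return [weight * (l + r) / 2.0 for l, r in zip(left, right)]
```

Under this sketch, identical channels pass through unchanged (correlation 1), while fully anti-correlated channels yield silence, which matches the intuition of "outputting the signal having a high correlation."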
[00073] The sound depth information obtaining unit 360 generates the sound depth information based on the center channel signal. The method by which the sound depth information obtaining unit 360 generates the sound depth information is as described above with reference to Figure 2. That is, first, the power of each frequency band of each of the sections constituting the center channel signal is calculated, and a common frequency band is determined based on the calculated powers. Then, the power variation of the common frequency band across at least two adjacent sections is measured, and a depth index is established according to the power variation. The greater the power variation of the common frequency band between adjacent sections, the more the sound object corresponding to the common frequency band needs to be expressed as approaching the user, and thus a large depth index value is established for the sound object. [00074] The parameter calculation unit 370 calculates the parameters to be applied to the modules that provide sound perspective, based on the depth index value. [00075] The phase control unit 372 duplicates the center channel signal into two signals and adjusts the phase of the duplicated signals according to the calculated parameter. When sound signals of different phases are reproduced using a left-side speaker and a right-side speaker, the sound becomes blurred. The more blurred the sound, the more difficult it is for the user to accurately perceive the position where the sound object is generated. Due to this phenomenon, when the phase control method is used in conjunction with other methods of providing perspective, the effect of providing perspective can be increased. The closer to the user the position where the sound object is generated (or the faster that position approaches the user), the larger the phase difference the phase control unit 372 can establish between the duplicated signals.
The duplicated signal whose phase has been adjusted is passed through the IFFT unit 320 and transmitted to the reflection effect provision unit 373. [00076] The reflection effect provision unit 373 models a reflection signal. If a sound object is generated far from the user, a direct sound that is transmitted directly to the user without being reflected, for example, by a wall, and a reflection sound that is generated by being reflected, for example, by a wall, have similar amplitudes, and there is hardly any difference between the times at which the direct sound and the reflected sound reach the user. However, if a sound object is generated close to the user, the amplitude difference between the direct sound and the reflected sound is large, and the difference between the times at which the direct sound and the reflected sound reach the user is large. Consequently, the closer to the user a sound object is generated, the more the reflection effect provision unit 373 reduces the gain value of the reflection signal, and the more it increases the time delay of the reflection signal or increases the amplitude of the direct sound. The reflection effect provision unit 373 transmits the center channel signal, in which the reflection signal is taken into account, to the near distance effect provision unit 374. [00077] The near distance effect provision unit 374 models a sound object generated at a close distance to the user, based on a parameter value calculated by the parameter calculation unit 370. If a sound object is generated at a position close to the user, its low-band component becomes prominent. The closer to the user the position where the sound object is generated, the more the near distance effect provision unit 374 increases the low-band component of the center signal. [00078] The sound stage extension unit 350, which receives the stereo input signal, processes the stereo input signal so that its sound stage is located on the outside of the speakers.
If the distance between the speakers is appropriate, the user can hear a stereophonic sound with presence. [00079] The sound stage extension unit 350 transforms the stereo input signal into a widened stereo signal. The sound stage extension unit 350 may include a magnification filter, obtained through left/right binaural synthesis and a crosstalk canceller, and a panorama filter, obtained through the convolution of the magnification filter and left/right direct filters. The magnification filter forms a virtual sound source at an arbitrary position based on a head-related transfer function (HRTF) measured at a predetermined position of a stereo signal, and cancels the crosstalk of the virtual sound source based on a filter coefficient in which the HRTF is reflected. The left and right direct filters adjust signal characteristics such as, for example, a gain or a delay, between the original stereo signal and the crosstalk-cancelled virtual sound source. [00080] The level control unit 371 adjusts the power value of the sound object based on the depth index calculated by the parameter calculation unit 370. The level control unit 371 further increases the power value of the sound object when the sound object is generated close to the user. [00081] The mixing unit 380 combines the stereo signal transmitted by the level control unit 371 and the center signal transmitted by the near distance effect provision unit 374. [00082] Figures 4A to 4D illustrate examples of providing a stereophonic sound according to an exemplary embodiment. [00083] Figure 4A illustrates a case in which stereophonic sound reproduction according to an exemplary embodiment does not operate. [00084] A user hears a sound object using at least one speaker. If the user reproduces a mono signal using a single speaker, the user cannot detect a stereo effect, but when a stereo signal is reproduced using two or more speakers, the user can detect a stereo effect.
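The sound stage extension performed by unit 350 relies on the HRTF-based filters described in paragraph [00079]. As a much simpler stand-in that shows the underlying idea of exaggerating inter-channel differences, the sketch below subtracts an attenuated copy of the opposite channel from each side; this crossfeed trick is an illustrative assumption, not the patent's filter chain.

```python
# Minimal stereo-widening sketch (an assumed simplification of the
# magnification technique): boosting the side content (L - R) pushes
# the perceived sound stage outward, while mono content (L == R) is
# only attenuated, not displaced.
def widen(left, right, amount=0.3):
    wide_left = [l - amount * r for l, r in zip(left, right)]
    wide_right = [r - amount * l for l, r in zip(left, right)]
    return wide_left, wide_right
```

A hard-left sample leaks into the right channel with inverted sign, which increases the inter-channel difference that the listener interprets as width.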
[00085] Figure 4B illustrates a case in which a sound object whose depth index is 0 is reproduced. With reference to Figures 4A to 4D, the depth index is assumed to have a value from 0 to 1. The closer to the user the position where a sound object to be expressed is generated, the greater the value of the depth index. [00086] Since the sound object's depth index is 0, no operation to give perspective to the sound object is performed. However, by allowing the sound stage to be located on the outside of the speakers, the user can better detect a stereo effect through the stereo signal. According to an exemplary embodiment, the technique of locating a sound stage on the outside of the speakers is referred to as magnification. [00087] Generally, sound signals of at least two channels are needed to reproduce a stereo signal. Thus, when a mono signal is input, sound signals corresponding to at least two channels are generated by upmixing. [00088] A stereo signal is reproduced by playing a first channel sound signal through a left-side speaker and a second channel sound signal through a right-side speaker. The user can detect a stereo effect by listening to at least two sounds generated at different positions. [00089] However, if the left-side speaker and the right-side speaker are placed too close to each other, the user perceives the sounds as generated at the same position and thus may not detect a stereo effect. In this case, the sound signals are processed so that the sounds are perceived as being generated not from the actual position of the speakers but from an outside of the speakers, that is, from an area outside the speakers, such as the area surrounding or adjacent to the speakers. [00090] Figure 4C illustrates a case in which a sound object having a depth index of 0.3 is reproduced, according to an exemplary embodiment.
[00091] As the sound object's depth index is greater than 0, in addition to the magnification technique, perspective corresponding to the depth index of 0.3 is provided to the sound object. Consequently, the user may perceive the sound object as being generated at a position closer to the user than the position where it is actually generated. [00092] For example, assume that the user is watching 3D image data, and an image object is expressed as advancing out of a screen. In Figure 4C, sound perspective is given to a sound object corresponding to the image object in order to process the sound object as if it were approaching the user. The user perceives the image object as advancing and the sound object as approaching, thereby detecting a more realistic stereo effect. [00093] Figure 4D illustrates a case in which a sound object having a depth index of 1 is reproduced. [00094] As the sound object's depth index is greater than 0, in addition to the magnification technique, sound perspective corresponding to the depth index of 1 is given to the sound object. As the depth index of the sound object illustrated in Figure 4D is greater than that of the sound object in Figure 4C, the user perceives the sound object as generated at a position even closer than that of Figure 4C. [00095] Figure 5 is a flowchart illustrating a method of generating sound depth information based on a sound signal, according to an exemplary embodiment. [00096] In operation S510, the power of each frequency band of each of the sections constituting a sound signal is calculated. [00097] In operation S520, a common frequency band is determined based on the power of each frequency band. [00098] The common frequency band refers to a frequency band that has a power of a predetermined value or greater and is common to the previous section and the current section.
Here, a frequency band having a small power may correspond to a meaningless sound object such as noise and thus may be excluded from the common frequency band. For example, a predetermined number of frequency bands may be selected in descending order of power values, and a common frequency band may then be determined among the selected frequency bands. [00099] In operation S530, the power of the common frequency band in the previous section and the power of the common frequency band in the current section are compared, and a depth index value is determined based on a result of the comparison. If the power of the common frequency band in the current section is greater than the power of the common frequency band in the previous section, it is determined that the sound object corresponding to the common frequency band is being generated at a position closer to the user. If the two powers are similar, it is determined that the sound object is not approaching the user. [000100] Figures 6A to 6D illustrate an example of generating sound depth information from a sound signal, according to an exemplary embodiment. [000101] Figure 6A illustrates a sound signal divided into a plurality of sections along a time axis, according to an exemplary embodiment. [000102] Figures 6B to 6D illustrate the powers of the frequency bands in the first, second, and third sections 601, 602, and 603. In Figures 6B to 6D, the first section 601 and the second section 602 are previous sections, and the third section 603 is the current section. [000103] With reference to Figures 6B and 6C, in the first section 601 and the second section 602, the powers of the 3000-4000 Hz, 4000-5000 Hz, and 5000-6000 Hz frequency bands are similar. Accordingly, the 3000-4000 Hz, 4000-5000 Hz, and 5000-6000 Hz frequency bands are determined as a common frequency band.
[000104] With reference to Figures 6C and 6D, assuming that the powers of the 3000-4000 Hz, 4000-5000 Hz, and 5000-6000 Hz frequency bands are of a predetermined value or greater in all of the first section 601, the second section 602, and the third section 603, the 3000-4000 Hz, 4000-5000 Hz, and 5000-6000 Hz frequency bands are determined as a common frequency band. [000105] However, in the third section 603, the power of the 5000-6000 Hz frequency band is substantially increased compared to the power of the 5000-6000 Hz frequency band in the second section 602. Thus, a depth index of a sound object corresponding to the 5000-6000 Hz frequency band is determined to be greater than 0. According to an exemplary embodiment, an image depth map may be consulted to decide the depth index of the sound object. [000106] For example, the power of the 5000-6000 Hz frequency band is substantially increased in the third section 603 compared to that in the second section 602. In some circumstances, this may be a case where the position at which the sound object corresponding to the 5000-6000 Hz frequency band is generated has not approached the user, and only the power has increased at the same position. Here, if an image object is projecting out of a screen in an image frame corresponding to the third section 603 when the image depth map is consulted, the probability is high that the sound object corresponding to the 5000-6000 Hz frequency band corresponds to that image object. In this case, the position at which the sound object is generated is gradually approaching the user, and thus the depth index of the sound object is set to be greater than 0. On the other hand, if there is no image object projecting out of the screen in the image frame corresponding to the third section 603, it can be considered that only the power of the sound object has increased while its position is maintained, and thus the depth index of the sound object can be set to 0.
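The procedure of operations S510-S530 can be sketched as follows. The band edges, the power threshold, and the mapping from power growth to a 0-1 depth index are illustrative assumptions; the patent leaves these concrete values open.

```python
import numpy as np

def band_powers(section: np.ndarray, rate: int, bands) -> list:
    """Power per frequency band of one time section, via the FFT (S510)."""
    spectrum = np.abs(np.fft.rfft(section)) ** 2
    freqs = np.fft.rfftfreq(len(section), d=1.0 / rate)
    return [spectrum[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands]

def common_bands(prev_p, cur_p, threshold):
    """Indices of bands whose power meets the threshold in both the
    previous and the current section (S520)."""
    return [i for i, (p, c) in enumerate(zip(prev_p, cur_p))
            if p >= threshold and c >= threshold]

def depth_index(prev_p, cur_p, band, ratio=2.0):
    """Depth index from the power increase of a common band (S530).
    `ratio` (how much growth counts as 'approaching') is an assumption."""
    if cur_p[band] > ratio * prev_p[band]:
        # Power grew sharply -> object is treated as approaching the user.
        return min(1.0, cur_p[band] / (ratio * prev_p[band]) - 1.0)
    return 0.0
```

In the Figure 6 example, the 5000-6000 Hz band would be a common band across all three sections, and its sharp power increase in the third section would yield a depth index greater than 0 (subject to the image-depth-map check described above).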
[000107] Figure 7 is a flowchart illustrating a method of reproducing a stereophonic sound, according to an exemplary embodiment. [000108] In operation S710, sound depth information is obtained. The sound depth information denotes a distance between at least one sound object within a sound signal and a reference position. [000109] In operation S720, sound perspective is given to the sound object based on the sound depth information. Operation S720 may include at least one of operations S721 through S724. [000110] In operation S721, a power gain of the sound object is adjusted based on the sound depth information. [000111] In operation S722, a gain and a delay time of a reflection signal, generated as the sound object is reflected by an obstacle, are adjusted based on the sound depth information. [000112] In operation S723, a low-band component of the sound object is adjusted based on the sound depth information. [000113] In operation S724, a phase difference between a phase of the sound object to be output from a first speaker and a phase of the sound object to be output from a second speaker is adjusted.
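Each of the four perspective operations S721-S724 can be sketched as a simple signal transform. The gain law, reflection delay and attenuation, low-pass coefficient, and phase shift below are assumed values chosen for illustration, not the patent's specification.

```python
import numpy as np

def adjust_gain(x, depth):                      # S721: louder when closer
    return x * (1.0 + depth)

def add_reflection(x, depth, base_delay=800):   # S722: reflection gain/delay
    # Assumed law: a closer object gets a weaker, earlier reflection.
    delay = int(base_delay * (1.0 - 0.5 * depth))
    gain = 0.3 * (1.0 - depth)
    y = x.copy()
    if 0 < delay < len(x):
        y[delay:] += gain * x[:-delay]          # delayed, attenuated copy
    return y

def boost_low_band(x, depth, alpha=0.9):        # S723: low-band emphasis
    # One-pole low-pass extracts the low band, which is mixed back in.
    low = np.zeros_like(x)
    for n in range(1, len(x)):
        low[n] = alpha * low[n - 1] + (1.0 - alpha) * x[n]
    return x + depth * low

def phase_offset(x, depth, max_shift=8):        # S724: inter-speaker phase
    shift = int(max_shift * depth)              # samples of relative delay
    left = x
    right = np.roll(x, shift)                   # second speaker lags slightly
    return left, right
```

A renderer could apply any subset of these transforms per sound object, scaled by that object's depth index, before mixing the result into the speaker feeds.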
Claims (11) [0001] 1. METHOD OF REPRODUCING A STEREOPHONIC SOUND, the method characterized by comprising: dividing a sound signal into a plurality of adjacent sections along a time axis; dividing the sound signal into a plurality of frequency bands; calculating a plurality of powers respectively corresponding to the plurality of frequency bands of each of the previous and current sections (S510); determining a frequency band that has a power of a predetermined value or greater and is common to adjacent sections, as a common frequency band, based on the calculated powers (S520); generating sound depth information, which denotes a distance between at least one sound object within the sound signal and a reference position, based on a difference between the power of the common frequency band in the current section and the power of the common frequency band in the previous section (S530); and providing sound perspective to the sound object emitted from a speaker, based on the sound depth information (S720). [0002] Method according to claim 1, characterized in that it further comprises: obtaining a center channel signal that is output from the sound signal to a center speaker, wherein calculating the power comprises calculating the power of each frequency band based on the center channel signal. [0003] Method according to claim 1, characterized in that providing the sound perspective comprises: adjusting the power of the sound object based on the sound depth information (S721). [0004] Method according to claim 1, characterized in that providing the sound perspective comprises: adjusting a gain and a delay time of a reflection signal that is generated when the sound object is reflected, based on the sound depth information (S722). [0005] Method according to claim 1, characterized in that providing the sound perspective comprises: adjusting a size of a low-band component of the sound object based on the sound depth information (S723). [0006] 6.
Method according to claim 1, characterized in that providing the sound perspective comprises: adjusting a phase difference between a phase of the sound object to be output from a first speaker and a phase of the sound object to be output from a second speaker (S724). [0007] 7. Method according to claim 1, characterized in that providing the sound perspective comprises: outputting the sound object, to which the perspective is provided, using a left-side surround speaker and a right-side surround speaker, or using a front left speaker and a front right speaker. [0008] 8. Method according to claim 1, characterized in that providing the sound perspective comprises: locating a sound stage in an area outside a speaker by using the sound signal. [0009] 9. STEREOPHONIC SOUND REPRODUCTION EQUIPMENT, characterized in that it comprises: an information obtaining unit (110) that obtains sound depth information, which denotes a distance between at least one sound object within a sound signal and a reference position, by analyzing the sound signal, by dividing the sound signal into a plurality of adjacent sections along a time axis and dividing the sound signal into a plurality of frequency bands; and a perspective providing unit (120) that provides sound perspective to the sound object based on the sound depth information; wherein the information obtaining unit (110) comprises: a power calculation unit (210) that calculates a plurality of powers respectively corresponding to the plurality of frequency bands of each of the previous and current sections; a determining unit (220) that determines a frequency band that has a power of a predetermined value or greater and is common to adjacent sections, as a common frequency band, based on the calculated power of each frequency band; and a generation unit (230) that generates the sound depth information based on a difference between a power of the common frequency band in the current section and a power of
the common frequency band in the previous section. [0010] 10. Stereophonic sound reproduction equipment according to claim 9, characterized in that it further comprises: a signal obtaining unit that obtains a center channel signal that is output from the sound signal to a center speaker, wherein the power calculation unit (210) calculates the power of each frequency band based on the center channel signal. [0011] 11. NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM, characterized by having embodied thereon instructions that, when executed, cause the method of any one of claims 1 to 8 to be performed.
Patent family:
Publication No. | Publication date
WO2011139090A3 | 2012-01-05
ZA201209123B | 2017-04-26
US20150365777A1 | 2015-12-17
AU2011249150B2 | 2014-12-04
EP2561688B1 | 2019-02-20
EP2561688A4 | 2015-12-16
AU2011249150A1 | 2012-12-06
CA2798558A1 | 2011-11-10
US9148740B2 | 2015-09-29
RU2012151848A | 2014-06-10
EP2561688A2 | 2013-02-27
CN102972047A | 2013-03-13
US9749767B2 | 2017-08-29
RU2540774C2 | 2015-02-10
WO2011139090A2 | 2011-11-10
US20110274278A1 | 2011-11-10
BR112012028272A2 | 2016-11-01
KR20110122631A | 2011-11-10
MX2012012858A | 2013-04-03
JP2013529017A | 2013-07-11
CA2798558C | 2018-08-21
CN102972047B | 2015-05-13
KR101764175B1 | 2017-08-14
JP5865899B2 | 2016-02-17
Legal status:
2018-12-26 | B06F | Objections, documents and/or translations needed after an examination request [chapter 6.6 patent gazette]
2019-10-22 | B06U | Preliminary requirement: requests with searches performed by other patent offices; procedure suspended [chapter 6.21 patent gazette]
2020-04-22 | B07A | Application suspended after technical examination (opinion) [chapter 7.1 patent gazette]
2021-03-09 | B06A | Patent application procedure suspended [chapter 6.1 patent gazette]
2021-06-15 | B09A | Decision: intention to grant [chapter 9.1 patent gazette]
2021-07-06 | B16A | Patent or certificate of addition of invention granted [chapter 16.1 patent gazette] | Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 04/05/2011, SUBJECT TO THE LEGAL CONDITIONS. PATENT GRANTED IN ACCORDANCE WITH ADI 5.529/DF, WHICH DETERMINES THE CHANGE OF THE GRANT TERM.
Priority:
Application No. | Filing date | Title
US33098610P | 2010-05-04 |
US61/330,986 | 2010-05-04 |
KR1020110022451A | 2011-03-14 | Method and apparatus for reproducing stereophonic sound
KR10-2011-0022451 | 2011-03-14 |
PCT/KR2011/003337 | 2011-05-04 | Method and apparatus for reproducing stereophonic sound